Fix SanitizeBoundingBoxes Handling of Semantic Masks #9256

zy1git · 2025-10-31T19:37:15Z

Summary:
Background
Currently, torchvision.transforms.v2.SanitizeBoundingBoxes fails when used inside a v2.Compose that receives both bounding boxes and a semantic segmentation mask as inputs. The transform attempts to apply a per-box boolean validity mask to all tv_tensors.Mask objects, including semantic masks (shape [H, W]), resulting in a shape mismatch and a crash.
Error Example:
IndexError: The shape of the mask [3] at index 0 does not match the shape of the indexed tensor [1080, 1920] at index 0
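The failure above can be reproduced outside the transform with plain boolean indexing. This is a minimal sketch, assuming ordinary `torch.Tensor` objects stand in for `tv_tensors.Mask` and for the per-box validity mask; the variable names are illustrative and not taken from the torchvision source.

```python
import torch

# Stand-ins for the objects involved (plain tensors, not tv_tensors):
semantic_mask = torch.zeros(1080, 1920, dtype=torch.uint8)  # semantic mask, shape [H, W]
valid = torch.tensor([True, False, True])                   # one flag per box, shape [3]

try:
    # A boolean index along dim 0 must have 1080 entries here, but it has 3.
    semantic_mask[valid]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

The per-box mask has one entry per bounding box, so it can only index a tensor whose leading dimension equals the number of boxes.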
Expected Behavior
- The transform should only sanitize masks that have a 1:1 mapping with bounding boxes (e.g., per-instance masks).
- Semantic masks (2D, shape [H, W]) should be passed through unchanged.
Task Objectives
Update SanitizeBoundingBoxes Logic:
- Detect whether a tv_tensors.Mask is a per-instance mask (shape [N, H, W] or [N, ...] where N == num_boxes) or a semantic mask (shape [H, W]).
- Only apply the per-box validity mask to per-instance masks.
- Pass through semantic masks unchanged.
- If a mask does not match the number of boxes, do not raise an error; instead, pass it through.
- Optionally, log a warning if a mask is skipped for sanitization due to shape mismatch.
Clarify Documentation:
- Update the docstring for SanitizeBoundingBoxes to explicitly state:
  - Only per-instance masks are sanitized.
  - Semantic masks are passed through unchanged.
  - The transform does not require users to pass masks to labels_getter for them to be sanitized.
- Add examples for both use cases (per-instance and semantic masks).
Add/Update Unit Tests:
- Test with both per-instance masks and semantic masks in a v2.Compose.
- Ensure semantic masks are not sanitized and do not cause errors.
- Ensure per-instance masks are sanitized correctly.
- These tests can be added to TestSanitizeBoundingBoxes.
Backward Compatibility:
- Ensure that the change does not break existing datasets or user code that relies on current behavior.
Finally, submit a PR with the changes and link the issue in the description.
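
The "Update SanitizeBoundingBoxes Logic" objective can be illustrated with a small standalone sketch. It is not the code from this PR: `filter_mask` is a hypothetical helper, plain `torch.Tensor` objects stand in for `tv_tensors.Mask`, and the rule "per-instance means at least 3 dimensions with a leading dimension equal to the number of boxes" is an assumption made here for illustration.

```python
import warnings

import torch


def filter_mask(mask: torch.Tensor, valid: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: drop mask entries that belong to invalid boxes.

    Assumption: a mask is per-instance when it has at least 3 dimensions
    and its leading dimension equals the number of boxes. Any other mask
    (for example a 2D semantic mask) is returned unchanged.
    """
    num_boxes = valid.shape[0]
    if mask.ndim >= 3 and mask.shape[0] == num_boxes:
        return mask[valid]  # keep one [H, W] slice per valid box
    if mask.ndim >= 3:
        # Leading dimension disagrees with the box count: skip it, but warn.
        warnings.warn(
            f"Mask of shape {tuple(mask.shape)} does not match {num_boxes} boxes; "
            "passing it through unsanitized."
        )
    return mask


valid = torch.tensor([True, False, True])                 # 3 boxes, the second is invalid
instance_masks = torch.zeros(3, 8, 8, dtype=torch.bool)   # per-instance, [N, H, W]
semantic_mask = torch.zeros(8, 8, dtype=torch.int64)      # semantic, [H, W]

kept = filter_mask(instance_masks, valid)
untouched = filter_mask(semantic_mask, valid)
print(tuple(kept.shape), tuple(untouched.shape))
```

One limitation of a purely shape-based rule: a mask whose leading dimension happens to equal the number of boxes cannot be told apart from a per-instance mask, which is why this sketch also requires at least 3 dimensions before filtering.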

Differential Revision: D85840801

pytorch-bot · 2025-10-31T19:37:19Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9256

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 8940dc3 with merge base ca22124:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Build Linux Wheels / pytorch/vision / build-manywheel-py3_10-xpu (gh) (trunk failure)
##[error]The operation was canceled.
Build Linux Wheels / pytorch/vision / upload / upload-manywheel-py3_10-xpu (gh) (trunk failure)
Unable to download artifact(s): Artifact not found for name: pytorch_vision__3.10_xpu_x86_64

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2025-10-31T19:37:23Z

@zy1git has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85840801.


NicolasHug

Nice work @zy1git, thanks for the PR!


meta-cla bot added the cla signed label Oct 31, 2025

meta-codesync bot added fb-exported meta-exported labels Oct 31, 2025

zy1git force-pushed the export-D85840801 branch from b677a61 to 62d317d Compare October 31, 2025 19:57

zy1git force-pushed the export-D85840801 branch from 62d317d to 65b5b53 Compare November 4, 2025 00:27

NicolasHug approved these changes Nov 4, 2025

zy1git force-pushed the export-D85840801 branch from d723fa1 to 8940dc3 Compare November 5, 2025 16:06
